Predictive models for diagnosis of middle ear disease in infancy: An overview of my PhD project

Joshua Myers

1. Introduction

Infants with onset of middle ear disease early in life are at greater risk of recurrent and chronic infections throughout childhood that can affect language and development. Diagnostic tools for identification of middle ear disease in infancy could help provide timely diagnosis and appropriate management for affected children.

Wideband acoustic immittance (WAI) is an innovative, high-resolution test of middle ear function that captures detailed information about the middle ear. The term WAI actually refers to a family of broadband tests including absorbance and admittance. Absorbance is the proportion of energy that is transferred through the middle ear. Admittance is a complex measurement with magnitude and phase. WAI is measured as a function of frequency (pitch) in the ear.

Preliminary research has shown WAI to be a promising test of middle ear function. An issue with clinical implementation, however, is that the amount of information generated by the test can make interpretation of results difficult. Currently, clinicians need to interpret results subjectively, which requires significant expertise and limits the usefulness of the test. The goal of my study was to develop objective methods for interpretation of WAI results in infancy.

Research question

How can we objectively diagnose middle ear disease in infants using the innovative WAI technology?

Overview of the study

The study was a large longitudinal study funded by the National Health and Medical Council. We recruited 753 infants at birth and followed them through infancy.

Data science issues

Data reduction was an important aspect of the research. The study was the largest of its kind to date, but the number of variables generated by the WAI test meant that we needed to be careful not to overfit the data. Data reduction strategies included data mining, averaging results together, principal component analysis, and using knowledge from the literature. I used data wrangling and visualization techniques to prepare and present the data, including cleaning messy text data and dealing with missing values. I needed to develop interpretable models that clinicians would be able to understand and be confident to use in their clinical work. A statistical issue was that multiple measurements were made on each subject: both ears of an infant were tested, and the infants were tested on multiple occasions through infancy. To take these issues into account, I used techniques such as hierarchical (mixed effects) models, and robust standard errors for statistical inference.
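
The frequency-averaging strategy mentioned above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name `band_average`, the centre frequencies, and the example absorbance values are all hypothetical, and the band edges are assumed to sit half a band either side of each centre on a log2 scale.

```python
from statistics import mean


def band_average(freqs_hz, values, centres_hz, fraction=1.0):
    """Average the values whose frequency falls inside each band.

    fraction=1.0 gives octave bands, fraction=0.5 gives 1/2 octave bands.
    Band edges are assumed to be centre / 2**(fraction / 2) and
    centre * 2**(fraction / 2); this convention is an assumption.
    """
    half_width = 2 ** (fraction / 2)
    averaged = {}
    for centre in centres_hz:
        lower, upper = centre / half_width, centre * half_width
        in_band = [v for f, v in zip(freqs_hz, values) if lower <= f < upper]
        if in_band:  # skip bands that contain no measured frequencies
            averaged[centre] = mean(in_band)
    return averaged


# Hypothetical absorbance values at four frequencies.
freqs = [800, 1000, 1200, 2000]
absorbance = [0.2, 0.4, 0.6, 0.9]
octave = band_average(freqs, absorbance, centres_hz=[1000, 2000])
print(octave)
```

Averaging in this way replaces many closely spaced, highly correlated frequency points with a handful of band values, which reduces the number of candidate predictors before any model is fitted.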

A statistical modelling issue was that there were large age effects on WAI results through infancy which needed to be taken into account when developing models. I first developed age-specific models for newborns, 6-month, and 12-month infants. I then did a longitudinal study to investigate age effects, and used this knowledge to develop a model for 6-18 month infants that controlled for developmental effects. Finally, I validated the newborn model in a new sample.

To make the models accessible, I implemented them in an online web application.

2. The newborn model

For the newborn model I allowed predictors to have a nonlinear association with the outcome using restricted cubic splines. I developed the model on data from one ear of subjects, and validated for overfitting using the opposite ear data, and with bootstrap resampling. Below is a figure of the raw WAI data, showing the distribution of results for normal ears and ears with middle ear disease.

Figure 1. Average (median, solid and dashed lines), and distribution (interquartile range, shaded regions) for normal (pass) and diseased ears (fail) for absorbance (top), admittance magnitude (middle), and admittance phase angle (bottom).

This figure shows results of three WAI tests - absorbance (top), admittance magnitude (middle), and admittance phase angle (bottom). We would expect results to be more predictive of middle ear dysfunction where there is less overlap between the normal and diseased groups. For example, absorbance (top) in the region of 2000 Hz may be quite predictive, since there is good separation between the normal and diseased groups.

The data were first reduced by averaging into octave bandwidths, and predictors for the model were chosen based on previous research in the literature. Predictors were allowed to have a nonlinear association with the outcome (middle ear disease) using restricted cubic splines. Model performance was assessed with the area under the receiver operating characteristic curve (AUC), a number between 0 and 1: an AUC of 1 means the model perfectly discriminates between normal and diseased subjects, and an AUC of 0.5 is no better than chance. AUC results were 0.88 for the fitted model, 0.9 for the validation sample (opposite ears), and 0.85 after correcting for overfitting with bootstrap resampling. These results suggest that the model was not substantially overfitting the data, and may generalize well to new samples. This model was published in Myers et al. (2018a).
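
To make the AUC concrete, here is a small sketch that computes it directly from its rank-based definition: the proportion of (diseased, normal) pairs in which the diseased ear receives the higher predicted risk, with ties counted as half. The function name `auc` and the example scores are hypothetical and are not taken from the study.

```python
def auc(scores, labels):
    """AUC as the fraction of diseased/normal pairs ranked correctly.

    scores: predicted risk for each ear (higher means more likely diseased)
    labels: 1 for diseased, 0 for normal
    """
    diseased = [s for s, y in zip(scores, labels) if y == 1]
    normal = [s for s, y in zip(scores, labels) if y == 0]
    if not diseased or not normal:
        raise ValueError("AUC needs at least one ear from each group")
    wins = 0.0
    for d in diseased:
        for n in normal:
            if d > n:
                wins += 1.0   # pair ranked correctly
            elif d == n:
                wins += 0.5   # tie counts as half
    return wins / (len(diseased) * len(normal))


# Hypothetical predicted risks for two normal and two diseased ears.
example_auc = auc([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1])
print(example_auc)  # 0.75: three of the four pairs are ranked correctly
```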

3. The 6-month model

The 6-month model used only absorbance results as predictors, to limit the number of candidate variables. The figure below shows the average and distribution of absorbance for normal and diseased ears. Again, the region around 1000 to 2000 Hz shows good separation between the normal and diseased groups.

Figure 2. Average absorbance (median, the lines) and distribution (interquartile range, shaded regions) for normal and diseased ears.

For this model I performed initial data reduction with 1/2 octave frequency averaging, and then did principal component analysis to further reduce the number of variables. The first five principal components were used as predictors in the model. The figure below shows the importance of absorbance predictors in the first five principal components. A darker shade indicates a more important variable for that component. For example, in the first principal component (PC1, left-hand side), 2000 to 8000 Hz have darker shading, and are therefore contributing more to that principal component.

Figure 3. Principal component analysis results showing the contribution of absorbance variables to each of the first five principal components (PCs).
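
As a rough illustration of what a principal component and its variable contributions (loadings) are, the sketch below extracts the first component of a small data matrix using power iteration on the covariance matrix. It is a teaching sketch in plain Python, not the procedure used in the study; the function name and the toy data are hypothetical, and only the first component is computed.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) for the first principal component.

    rows: list of observations, each a list of band-averaged values.
    Uses power iteration, which assumes the starting vector is not
    orthogonal to the first component.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix of the centred variables.
    cov = [[sum(c[a] * c[b] for c in centred) / (n - 1) for b in range(p)]
           for a in range(p)]
    vec = [1.0] * p
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0:
            break
        vec = [v / norm for v in nxt]
    # Scores are the projections of each observation onto the loadings.
    scores = [sum(c[j] * vec[j] for j in range(p)) for c in centred]
    return vec, scores


# Toy data in which the second variable is exactly twice the first.
toy = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
loadings, scores = first_principal_component(toy)
print(loadings)
```

In this toy case the second variable has the larger absolute loading, so it contributes more to the component. The scores, rather than the original variables, are what would be passed to a model as predictors.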

The AUC of the fitted model was 0.89, 0.85 when applied to the validation sample (the opposite ears), and 0.87 after being corrected for bias with bootstrap resampling. The model was published in Myers et al. (2018b).
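
The bootstrap correction can be sketched as below. The sketch assumes the commonly used optimism bootstrap: refit the model on each resample, measure how much better it looks on the resample than on the original data, and subtract the average of that gap from the apparent AUC. The toy "model" (difference of class means), the simulated data, and all function names are hypothetical stand-ins and are not the models or data from the study.

```python
import random


def auc(scores, labels):
    """Fraction of diseased/normal pairs ranked correctly (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit(rows, labels):
    """Toy model: weights are the difference between the class means."""
    p = len(rows[0])

    def class_mean(target):
        members = [r for r, y in zip(rows, labels) if y == target]
        return [sum(r[j] for r in members) / len(members) for j in range(p)]

    return [a - b for a, b in zip(class_mean(1), class_mean(0))]


def predict(weights, rows):
    return [sum(w * x for w, x in zip(weights, r)) for r in rows]


def optimism_corrected_auc(rows, labels, n_boot=200, seed=0):
    rng = random.Random(seed)
    n = len(rows)
    apparent = auc(predict(fit(rows, labels), rows), labels)
    gaps = []
    while len(gaps) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        b_labels = [labels[i] for i in idx]
        if len(set(b_labels)) < 2:
            continue  # a resample with one class cannot be scored
        b_rows = [rows[i] for i in idx]
        w = fit(b_rows, b_labels)
        # Optimism: performance on the resample minus on the original data.
        gaps.append(auc(predict(w, b_rows), b_labels) - auc(predict(w, rows), labels))
    optimism = sum(gaps) / n_boot
    return {"apparent": apparent, "optimism": optimism,
            "corrected": apparent - optimism}


# Simulated data: one informative predictor and one pure-noise predictor.
data_rng = random.Random(1)
labels = [0] * 30 + [1] * 30
rows = [[data_rng.gauss(0.6 if y == 0 else 0.3, 0.15), data_rng.gauss(0.0, 1.0)]
        for y in labels]
result = optimism_corrected_auc(rows, labels)
print(result)
```

A corrected AUC close to the apparent AUC, as reported for the models here, suggests that little of the apparent performance is due to overfitting.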

4. The 12-month model

The 12-month model was an ordinal model, again using absorbance results as predictors. The ordinal target condition for this model was: normal, mild or severe middle ear disease. I took this approach because prior research has shown an ordinal association between absorbance results and middle ear condition, and clinicians may benefit from the extra information of knowing whether the disease is likely to be mild or severe in nature. The figure below shows absorbance results stratified by the target condition (normal, mild, severe), and does appear to show an ordinal association (i.e., a systematic decrease in absorbance from normal through to mild and then severe). This was further assessed with statistical graphs which are not shown, but confirmed the ordinal relationship in the data.

Figure 4. Absorbance stratified by levels of the target condition (middle ear disease).

Data reduction for this model again used 1/2 octave averaging, and predictors were selected based on prior research. I used Huber-White robust covariance estimates to account for multiple measurements, as results from both ears were used to develop the model. The AUC for the model was 0.92, and 0.91 after being corrected for overfitting using bootstrap resampling. This model is described in Myers et al. (2019a), which has been submitted for publication.
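
The idea behind a cluster-robust (Huber-White) estimate can be illustrated in the simplest possible setting, the standard error of a mean. This is a deliberately reduced sketch and not the ordinal regression used in the study: residuals are summed within each infant before being squared, so that two correlated ears are not counted as two independent observations. The function name, the example values, and the absence of any small-sample correction are all assumptions of this illustration.

```python
import math
from collections import defaultdict


def mean_with_standard_errors(values, cluster_ids):
    """Return (mean, independence SE, cluster-robust SE).

    Both standard errors use the sandwich form for an intercept-only
    model with no small-sample correction.
    """
    n = len(values)
    mean = sum(values) / n
    residuals = [v - mean for v in values]
    # Treats every measurement as independent.
    independence_se = math.sqrt(sum(e * e for e in residuals)) / n
    # Sums residuals within each cluster (infant) before squaring.
    cluster_sums = defaultdict(float)
    for e, g in zip(residuals, cluster_ids):
        cluster_sums[g] += e
    robust_se = math.sqrt(sum(s * s for s in cluster_sums.values())) / n
    return mean, independence_se, robust_se


# Hypothetical absorbance values: two infants, both ears identical.
values = [0.2, 0.2, 0.8, 0.8]
infants = ["a", "a", "b", "b"]
m, se_indep, se_robust = mean_with_standard_errors(values, infants)
print(m, se_indep, se_robust)
```

With perfectly correlated ears the robust standard error is larger than the independence one, reflecting that there are effectively two independent observations here rather than four.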

5. The longitudinal study

The longitudinal study compared results from normal ears through infancy to investigate the effect of age on WAI results. Mixed effects models were used to account for correlations between the data points (i.e., multiple measurements made on infants). I presented results by calculating the model (least squares) means and confidence intervals. Non-overlapping confidence intervals between age groups were taken as evidence of a genuine developmental difference. The figure below shows the model means and confidence intervals for the various age groups. For example, for absorbance (top panel), results for newborns do not overlap other age groups except for around 2000 and 6000 Hz.

Figure 5. Model (least squares) means (lines) and 95% confidence intervals (shaded regions) by age for absorbance (top panel), admittance magnitude (middle panel), and normalized admittance magnitude (bottom panel).
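
The overlap rule used to read this figure can be written down in a few lines. The sketch below assumes normal-approximation 95% intervals (mean plus or minus 1.96 standard errors); the means, standard errors, and frequencies are invented for illustration and are not results from the study.

```python
def confidence_interval(mean, se, z=1.96):
    """Normal-approximation confidence interval (z=1.96 for 95%)."""
    return (mean - z * se, mean + z * se)


def intervals_overlap(a, b):
    """True if intervals a and b, given as (low, high), share any point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical model means and standard errors at two frequencies (Hz).
newborn = {1000: (0.60, 0.02), 2000: (0.70, 0.03)}
six_month = {1000: (0.45, 0.02), 2000: (0.72, 0.03)}

differs = {}
for freq in newborn:
    ci_newborn = confidence_interval(*newborn[freq])
    ci_six = confidence_interval(*six_month[freq])
    differs[freq] = not intervals_overlap(ci_newborn, ci_six)
print(differs)
```

Non-overlap is a conservative criterion: intervals that overlap slightly can still correspond to a statistically significant difference, so this rule flags only the clearer age effects.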

This study is described in Myers et al. (2019b), which has been submitted for publication.

6. The model accounting for age and the validation study

I used the knowledge gained from the longitudinal study to develop a model for infants aged 6 to 18 months that controlled for developmental effects. This was again an ordinal model using 1/2 octave averaged absorbance results, similar to the 12-month model. I again used Huber-White robust covariance estimates in this model to account for multiple measurements made on subjects through infancy. The AUC results for this model were 0.88, and 0.87 after being corrected for bias using bootstrap resampling. Applying the model to the opposite ears yielded AUC of 0.89.

Examples of applying the model to absorbance results from two infants are shown below. Each graph plots the absorbance test result (the line) against the 90% normal range (shaded region), and presents the probability that the middle ear (ME) of the subject has either mild or severe dysfunction. The likely diagnosis for the infant in the top panel is severe dysfunction, since the probabilities of mild and severe dysfunction are both high. The most likely diagnosis for the case in the bottom panel is mild dysfunction, since there is a high probability of mild dysfunction, but not of severe disease.

Figure 6. Examples of applying the model to specific cases.
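
The way an ordinal model turns a single risk score into probabilities for each level of the target condition can be sketched as follows. The sketch assumes a proportional odds (cumulative logit) form, in which the probability of at least mild and the probability of severe dysfunction share the same predictor effects and differ only in their intercepts. The intercepts and the linear predictor values are invented, and the function names are hypothetical.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def ordinal_probabilities(linear_predictor, intercept_mild, intercept_severe):
    """Probabilities for normal / mild / severe under proportional odds.

    P(at least mild) = logistic(intercept_mild + linear_predictor)
    P(severe)        = logistic(intercept_severe + linear_predictor)
    Requires intercept_mild >= intercept_severe.
    """
    p_at_least_mild = logistic(intercept_mild + linear_predictor)
    p_severe = logistic(intercept_severe + linear_predictor)
    return {
        "normal": 1.0 - p_at_least_mild,
        "mild": p_at_least_mild - p_severe,
        "severe": p_severe,
        "at_least_mild": p_at_least_mild,
    }


# Two hypothetical infants with different risk scores.
high_risk = ordinal_probabilities(3.0, intercept_mild=0.0, intercept_severe=-2.0)
moderate_risk = ordinal_probabilities(1.0, intercept_mild=0.0, intercept_severe=-2.0)
print(high_risk)
print(moderate_risk)
```

In the first case both the probability of at least mild and the probability of severe dysfunction are high, which points to severe dysfunction. In the second, only the probability of at least mild dysfunction is high, which points to mild dysfunction.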

I also validated the newborn model in a new sample of 124 newborn infants. The newborn model had an AUC of 0.84 when applied to the new sample, which was very close to the bootstrap-corrected result of 0.85 from the initial model, suggesting that the model generalized well to new subjects. These studies are described in Myers et al. (2019c), which has been submitted for publication.

7. Conclusions

I developed predictive models for objective diagnosis of middle ear disease in infancy. The models could be fully automated for use in screening settings, or risk scores could be presented to clinicians along with graphical results to aid in clinical diagnosis.

I have made the models available to clinicians by implementing them in a web application which is available here. To try out the model, choose an age group from the “Select age” menu, then click one of the example files to download it to your computer (this is the test result that is saved by the clinician), and then upload the file with the “Browse” button to make a prediction based on the results saved in the file.

8. References

Myers, J., Kei, J., Aithal, S., Aithal, V., Driscoll, C., Khan, A., Manual, A., Joseph, A., & Malicka, A. N. (2018a). Development of a diagnostic prediction model for conductive conditions in neonates using wideband acoustic immittance. Ear and Hearing, 39(6), 1116-1135.

Myers, J., Kei, J., Aithal, S., Aithal, V., Driscoll, C., Khan, A., Manual, A., Joseph, A., & Malicka, A. N. (2018b). Diagnosing middle ear pathology in 6- to 9-month-old infants using wideband absorbance: A risk prediction model. Journal of Speech, Language, and Hearing Research, 61(9), 2386-2404.

Myers, J., Kei, J., Aithal, S., Aithal, V., Driscoll, C., Khan, A., Manual, A., Joseph, A., & Malicka, A. N. (2019a). Diagnosing middle ear dysfunction in 10- to 16-month-old infants using wideband absorbance: An ordinal prediction model. Manuscript submitted to Journal of Speech, Language, and Hearing Research.

Myers, J., Kei, J., Aithal, S., Aithal, V., Driscoll, C., Khan, A., Manual, A., Joseph, A., & Malicka, A. N. (2019b). Longitudinal development of wideband absorbance and admittance through infancy. Manuscript submitted to Journal of Speech, Language, and Hearing Research.

Myers, J., Kei, J., Aithal, S., Aithal, V., Driscoll, C., Khan, A., Manual, A., Joseph, A., & Malicka, A. N. (2019c). Diagnosing conductive dysfunction in infants using wideband acoustic immittance: Validation and development of predictive models. Manuscript submitted to Journal of Speech, Language, and Hearing Research.